Production I/O Characterization on the Cray XE6
Abstract
I/O performance is an increasingly important factor in the productivity of large-scale HPC systems such as Hopper, a 153,216-core Cray XE6 system operated by the National Energy Research Scientific Computing Center (NERSC). The diversity of scientific workloads on such systems, however, makes I/O performance tuning a challenge. Applications vary in data volume, I/O strategy, and access method, making it difficult to evaluate and enhance their I/O performance consistently. To address this challenge, we have adapted the Darshan I/O characterization tool for use on Hopper. Darshan is an I/O instrumentation library that collects I/O access pattern information from large-scale production applications with minimal overhead. In this paper we present our experiences in deploying Darshan on the Cray XE6 platform, including a performance evaluation of Darshan with up to 98,304 processes and a case study of how to identify the applications that can benefit most from I/O performance tuning. Darshan was automatically enabled for all Hopper users in November 2012 and, as of April 2013, instruments over 5,000 jobs per day.
Related resources
Trillion Particles, 120,000 Cores, and 350 TBs: Lessons Learned from a Hero I/O Run on Hopper
Modern petascale applications can present a variety of configuration, runtime, and data management challenges when run at scale. In this paper, we describe our experiences in running VPIC, a large-scale plasma physics simulation, on the NERSC production Cray XE6 system Hopper. The simulation ran on 120,000 cores using ∼80% of computing resources, 90% of the available memory on each node and 50%...
Tuning Parallel I/O on Blue Waters for Writing 10 Trillion Particles
Large-scale simulations running on hundreds of thousands of processors produce hundreds of terabytes of data that need to be written to files for analysis. One such application is VPIC code that simulates plasma behavior such as magnetic reconnection and turbulence in solar weather. The number of particles VPIC simulates is in the range of trillions and the size of data files to store is in the...
Characterizing I/O Performance Using the TAU Performance System
TAU is an integrated toolkit for performance instrumentation, measurement, and analysis. It provides a flexible, portable, and scalable set of technologies for performance evaluation on extreme-scale HPC systems. This paper describes alternatives for I/O instrumentation provided by TAU and the design and implementation of a new tool, tau_gen_wrapper, to wrap external libraries. It describes thr...
A File System Utilization Metric for I/O Characterization
A high-performance computing (HPC) platform today typically contains a high-performance parallel scratch file system for data storage. Today, such file systems account for 10-20% of the purchase price of an HPC resource. Looking forward, it is apparent that the rate of increase of hard drive performance will not keep up with the expected gains in processing, and therefore any effort to keep I/O pe...
Transitioning Users from the Franklin XT4 System to the Hopper XE6 System
The Hopper XE6 system, NERSC’s first petaflop system with over 153,000 cores, has increased the computing hours available to the Department of Energy’s Office of Science users by more than a factor of 4. As NERSC users transition from the Franklin XT4 system with 4 cores per node to the Hopper XE6 system with 24 cores per node, they have had to adapt to a lower amount of memory per core and onn...
Publication date: 2013